Add a new stage to generate `zebin` to align CUDA stages in triton.compile #5189

chengjunlu · 2025-09-25T05:47:12Z

Add a new stage to generate zebin for XPU.

anmyachev · 2025-09-25T13:49:38Z

third_party/intel/backend/compiler.py

            stages["ttgir"] = lambda src, metadata: self.gluon_to_ttgir(src, metadata, options)
        stages["llir"] = lambda src, metadata: self.make_llir(src, metadata, options)
        stages["spv"] = lambda src, metadata: self.make_spv(src, metadata, options, self.device_arch)
+        stages["zebin"] = lambda src, metadata: self.make_zebin(src, metadata, options, self.device_arch)


We can't make this step mandatory yet (see #5153 (comment)), but making it optional via options.generate_native_code would let us do a useful refactoring right now.

I refactored the code to use options.generate_native_code. Please review the changes.
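For readers following the thread, here is a minimal sketch of what an optional zebin stage could look like. It is not the code from this PR: `Options`, `add_stages`, `binary_ext`, and the placeholder lambdas are hypothetical stand-ins, and only the stage names and the `generate_native_code` option come from the discussion above.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Options:
    # Hypothetical stand-in for the backend's options object.
    generate_native_code: bool = False


def add_stages(stages: Dict[str, Callable], options: Options) -> None:
    """Register stages in pipeline order; zebin is added only on request."""
    # Placeholder lowerings: each one just tags its input string.
    stages["llir"] = lambda src, metadata: f"llir({src})"
    stages["spv"] = lambda src, metadata: f"spv({src})"
    if options.generate_native_code:
        # Optional final stage, gated by the option discussed above.
        stages["zebin"] = lambda src, metadata: f"zebin({src})"


def binary_ext(options: Options) -> str:
    """Name of the last stage, i.e. the artifact handed to the launcher."""
    return "zebin" if options.generate_native_code else "spv"
```

Gating both the stage registration and the binary extension on the same option keeps the default SPIR-V path unchanged while the native-code path is still optional.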

Copilot

Pull Request Overview

This PR adds a new "zebin" compilation stage for the XPU backend to align with the CUDA compilation stages in triton.compile. The change introduces zebin as a binary format alternative to SPIR-V for Intel XPU targets.

- Adds a make_zebin method to generate the zebin binary format from SPIR-V input
- Updates the binary extension from "spv" to "zebin" for the XPU backend
- Modifies the compilation pipeline to handle zebin as a binary format alongside cubin and hsaco
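The last point can be illustrated with a small sketch of extension-based artifact loading. This is an assumption about the general shape of such handling, not the code in python/triton/compiler/compiler.py; `BINARY_EXTS` and `read_artifact` are hypothetical names.

```python
from pathlib import Path
from typing import Union

# Extensions treated as raw bytes. cubin, hsaco and zebin are named in the
# overview above; spv is included because SPIR-V modules are binary too.
BINARY_EXTS = {"cubin", "hsaco", "zebin", "spv"}


def read_artifact(path: Union[str, Path]) -> Union[bytes, str]:
    """Load a compilation artifact as bytes or text, based on its extension."""
    path = Path(path)
    ext = path.suffix.lstrip(".")
    if ext in BINARY_EXTS:
        return path.read_bytes()
    # Everything else (for example textual IR) is read as text.
    return path.read_text()
```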

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| third_party/intel/backend/compiler.py | Adds zebin compilation stage and updates binary extension |
| python/triton/compiler/compiler.py | Updates file parsing and compilation pipeline to support zebin format |

etiotto

Instead of using ocloc to generate the native binary, can we use L0 (Level Zero) to generate it?

How about trying: https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#module-caching-with-native-binaries

third_party/intel/backend/compiler.py

chengjunlu · 2025-10-29T04:29:52Z

> Instead of using ocloc to generate the native binary, can we use L0 to generate it?
>
> How about trying: https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#module-caching-with-native-binaries

The L0 API requires a device context, which is not available during triton.compile.
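To make the trade-off concrete: an offline ocloc invocation needs only a device architecture name, not a live device. The sketch below builds such a command line. `build_ocloc_command` is a hypothetical helper, and the flags reflect my understanding of the ocloc CLI (`compile`, `-file`, `-spirv_input`, `-device`, `-o`, `-options`); verify them with `ocloc compile --help` on your installation.

```python
from typing import List


def build_ocloc_command(spv_path: str, out_path: str, device_arch: str,
                        build_flags: str = "") -> List[str]:
    """Build an offline-compile command that turns SPIR-V into a native binary.

    Only the architecture name is needed, so no device context is involved.
    """
    cmd = [
        "ocloc", "compile",
        "-file", spv_path,      # input module
        "-spirv_input",         # treat the input as SPIR-V, not OpenCL C
        "-device", device_arch, # target architecture name, e.g. "pvc"
        "-o", out_path,         # output file
    ]
    if build_flags:
        # Extra build options are forwarded as one string.
        cmd += ["-options", build_flags]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ocloc is installed.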

chengjunlu · 2025-10-29T07:13:57Z

third_party/intel/tools/intel/compile.cpp

  size_t global_range_y = {gridY};
  size_t global_range_z = {gridZ};
  size_t local_range_x = {num_warps} * {threads_per_warp};
-  if (driver_version.find("+") != std::string::npos) {{


This code doesn't make sense. Remove it.

chengjunlu requested review from anmyachev and whitneywhtsang September 25, 2025 05:47

chengjunlu changed the title ~~Add a new stage to generate zebin to align CUDA stages in triton.compile~~ [Draft] Add a new stage to generate zebin to align CUDA stages in triton.compile Sep 25, 2025

chengjunlu linked an issue Sep 25, 2025 that may be closed by this pull request

[Pytorch upstream] Feature request: Save SPIR-V Build flag to CompiledKernel metadata for Inductor. #5153

Open

chengjunlu force-pushed the chengjun/add_zebin_stage branch from 4cba65d to a943a26 Compare September 25, 2025 06:00

anmyachev reviewed Sep 25, 2025

etiotto requested a review from Copilot September 25, 2025 14:58

Copilot AI reviewed Sep 25, 2025

etiotto reviewed Sep 25, 2025

etiotto marked this pull request as draft October 9, 2025 14:10

This was referenced Oct 17, 2025

[Pytorch upstream] Feature request: Save SPIR-V Build flag to CompiledKernel metadata for Inductor. #5153

Open

Generate native code using L0 sdk instead of ocloc #5342

Closed

chengjunlu force-pushed the chengjun/add_zebin_stage branch from a943a26 to c7cbf86 Compare October 29, 2025 03:32

chengjunlu marked this pull request as ready for review October 29, 2025 03:32

chengjunlu linked an issue Oct 29, 2025 that may be closed by this pull request

Binary kernel for Inductor static kernel launcher. #5388

Open

chengjunlu mentioned this pull request Oct 29, 2025

Binary kernel for Inductor static kernel launcher. #5388

Open

chengjunlu force-pushed the chengjun/add_zebin_stage branch from c7cbf86 to f2186c2 Compare October 29, 2025 03:50

chengjunlu changed the title ~~[Draft] Add a new stage to generate zebin to align CUDA stages in triton.compile~~ Add a new stage to generate zebin to align CUDA stages in triton.compile Oct 29, 2025

chengjunlu force-pushed the chengjun/add_zebin_stage branch from f2186c2 to dabaee1 Compare October 29, 2025 04:28

chengjunlu force-pushed the chengjun/add_zebin_stage branch 2 times, most recently from bbd3669 to b66f0b5 Compare October 29, 2025 07:13

chengjunlu commented Oct 29, 2025

Add a new stage to generate zebin when TRITON_XPU_GEN_NATIVE_CODE=1…

b66f0b5

… or option = {"generate_native_code": 1}. Signed-off-by: Lu,Chengjun <[email protected]>
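The commit message names two ways to enable the stage: the TRITON_XPU_GEN_NATIVE_CODE environment variable or the generate_native_code option. A minimal sketch of resolving the two follows. The precedence (an explicit option wins over the environment) is an assumption for illustration, and `resolve_generate_native_code` is a hypothetical helper, not a function from the PR.

```python
import os
from typing import Mapping, Optional, Union


def resolve_generate_native_code(
    option_value: Optional[Union[bool, int, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Decide whether the zebin stage should run.

    Assumed precedence: an explicitly passed option overrides the
    environment variable; otherwise the variable must be exactly "1".
    """
    if option_value is not None:
        return bool(int(option_value))
    env = os.environ if env is None else env
    return env.get("TRITON_XPU_GEN_NATIVE_CODE", "0") == "1"
```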

4 participants
